Improve package scan performance #4606

Merged
AyanSinhaMahapatra merged 17 commits into develop from fast-package-scan on Jan 12, 2026

@AyanSinhaMahapatra AyanSinhaMahapatra commented Nov 17, 2025

This PR improves package scan performance by:

  • Skipping binary package detection steps by default, and introducing a new CLI option --package-in-compiled to detect packages in compiled binaries such as Rust/Go binaries
  • Creating cached regex patterns and multiregex pre-matchers for a fast package path detection filtering step

References:

Tasks

  • Reviewed contribution guidelines
  • PR is descriptively titled 📑 and links the original issue above 🔗
  • Tests pass -- look for a green checkbox ✔️ a few minutes after opening your PR
    Run tests locally to check for errors.
  • Commits are in uniquely-named feature branch and has no merge conflicts 📁
  • Updated documentation pages (if applicable)
  • Updated CHANGELOG.rst (if applicable)

Signed-off-by: Ayan Sinha Mahapatra <asmahapatra@aboutcode.org>
Reference: https://github.com/Quantco/multiregex
Signed-off-by: Ayan Sinha Mahapatra <asmahapatra@aboutcode.org>
Use multiregex with cached regex path patterns and a
datafile handlers mapping to detect package datafiles faster.
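
A minimal sketch of the idea, for readers unfamiliar with the approach: compile the path patterns once, keep them in a cached pattern-to-handler mapping, and use a cheap literal substring check as a pre-matcher so that most regexes never run. This uses only the standard library `re` and `fnmatch` modules rather than multiregex itself, and the handler names, glob patterns, and function names are hypothetical, not the ones used in this PR.

```python
import fnmatch
import re
from functools import lru_cache

# Hypothetical handler names and glob patterns, for illustration only.
HANDLER_PATTERNS = {
    "npm_package_json": ("*/package.json",),
    "pypi_setup_py": ("*/setup.py",),
    "cargo_toml": ("*/Cargo.toml", "*/cargo.toml"),
}

# A literal that must appear in any path matched by the handler's patterns.
# A substring check is much cheaper than running a regex.
PREMATCHERS = {
    "npm_package_json": "package.json",
    "pypi_setup_py": "setup.py",
    "cargo_toml": "argo.toml",
}


@lru_cache(maxsize=None)
def compiled_patterns():
    """Compile every glob once and cache the handler -> regex mapping."""
    return {
        name: re.compile("|".join(fnmatch.translate(g) for g in globs))
        for name, globs in HANDLER_PATTERNS.items()
    }


def handlers_for_path(path):
    """Return the names of handlers whose patterns match ``path``."""
    matched = []
    for name, regex in compiled_patterns().items():
        # Pre-match step: skip the regex when the literal is absent.
        if PREMATCHERS[name] not in path:
            continue
        if regex.match(path):
            matched.append(name)
    return matched
```

With this layout, a path such as `README.md` is rejected by substring checks alone, and the regexes are compiled only on the first call.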

Reference: #4064
Reference: #4061
Signed-off-by: Ayan Sinha Mahapatra <asmahapatra@aboutcode.org>
@AyanSinhaMahapatra AyanSinhaMahapatra marked this pull request as draft November 17, 2025 09:49
@AyanSinhaMahapatra AyanSinhaMahapatra changed the title from "Fast package scan" to "Improve package scan performance" Nov 17, 2025
Signed-off-by: Ayan Sinha Mahapatra <asmahapatra@aboutcode.org>
Signed-off-by: Ayan Sinha Mahapatra <asmahapatra@aboutcode.org>
Signed-off-by: Ayan Sinha Mahapatra <asmahapatra@aboutcode.org>
@AyanSinhaMahapatra AyanSinhaMahapatra marked this pull request as ready for review November 19, 2025 09:47
Introduce a new option --binary-packages which looks for
package/dependency data in binaries.

Signed-off-by: Ayan Sinha Mahapatra <asmahapatra@aboutcode.org>
We do not need the license index in a --package-only scan,
as it is designed to be a fast, package-detection-only scan
that skips license detection. Since loading the license index
takes a couple of seconds each time, skipping it makes the
package-only scan much faster.
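
The effect described above can be sketched as lazy loading: the expensive index is built only when a code path asks for it, so a package-only run never pays the cost. This is an illustration under assumptions, not the code from this PR; `get_license_index`, `index_loaded`, `scan`, and the sample data are hypothetical names.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def get_license_index():
    """Stand-in for an expensive index load; the body runs at most once."""
    return {"mit": "MIT License", "apache-2.0": "Apache License 2.0"}


def index_loaded():
    """Report whether the index has been built yet."""
    return get_license_index.cache_info().currsize > 0


def scan(paths, package_only=False):
    """Toy scan: flag manifest files, and touch the index only if needed."""
    results = []
    for path in paths:
        entry = {"path": path, "is_manifest": path.endswith("package.json")}
        if not package_only:
            # Only the full scan needs the license index.
            entry["known_licenses"] = len(get_license_index())
        results.append(entry)
    return results
```

Calling `scan(paths, package_only=True)` leaves the index unbuilt, while the first full scan builds it once and later scans reuse it.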

Signed-off-by: Ayan Sinha Mahapatra <asmahapatra@aboutcode.org>
Signed-off-by: Ayan Sinha Mahapatra <asmahapatra@aboutcode.org>

@pombredanne pombredanne left a comment

Here are some nits for your consideration!

Signed-off-by: Ayan Sinha Mahapatra <asmahapatra@aboutcode.org>
Signed-off-by: Ayan Sinha Mahapatra <asmahapatra@aboutcode.org>
Signed-off-by: Ayan Sinha Mahapatra <asmahapatra@aboutcode.org>
Signed-off-by: Ayan Sinha Mahapatra <asmahapatra@aboutcode.org>
Signed-off-by: Ayan Sinha Mahapatra <asmahapatra@aboutcode.org>
Signed-off-by: Ayan Sinha Mahapatra <asmahapatra@aboutcode.org>
Signed-off-by: Ayan Sinha Mahapatra <asmahapatra@aboutcode.org>
@AyanSinhaMahapatra

Review comments have been addressed, merging!

@AyanSinhaMahapatra AyanSinhaMahapatra merged commit fa63f1a into develop Jan 12, 2026
38 of 39 checks passed
@AyanSinhaMahapatra AyanSinhaMahapatra deleted the fast-package-scan branch January 12, 2026 11:45